bitswap_unstable_stream added#3264
Conversation
| format!("Bitswap stream subscription: {} cids", cids.len()) | ||
| ); | ||
|
|
||
| match me.bitswap_service.bitswap_stream(cids).await { |
There was a problem hiding this comment.
Not sure if I'm right but is client connection blocked while awaiting here?
There was a problem hiding this comment.
The await covers only the synchronous handoff (parse_and_dedup + a channel send + a oneshot for the BatchId), not the actual Have broadcast or per-CID events - those flow later via the events-pump task pushed to background_tasks at line 1220 (light-base/src/json_rpc_service/background.rs).
The only backpressure could be the bitswap service's bounded inbound channel being saturated - I think this shall not happen.
Any ideas how to solve it in different way?
dmitry-markin
left a comment
There was a problem hiding this comment.
The implementation looks correct and carefully written.
One thing I would do differently though, is apply the fail-fast principle and return an error to the subscription request in case the Have request failed, instead of faning-out errors for all the individual CIDs. May be there is some justification for not doing so?
| // One Cancel wantlist message containing all pending CIDs, sent to every peer this | ||
| // batch's Have broadcast reached. Cancel for an unknown CID is harmless on the receiver | ||
| // side — the peer no-ops. | ||
| let message = build_bitswap_cancel_message(pending_cids.iter()); |
There was a problem hiding this comment.
As a side note — substrate Bitswap server implementation never tracks pending want-lists (i.e., if it doesn't have the requested CID, it will just respond "NO" and never get back to it again). So, currently, it doesn't make sense to send cancel messages. While being mostly harmful (nit the bandwidth usage), we should probably:
- document it here and write that the message is sent only for conformance with IPFS Bitswap protocol implementation in case it is later updated on substrate's side;
- consider "subscribing" to not yet available chunks in substrate to handle cases when the publisher submitted the transaction, communicated to the receiver the CID, and the receiver need to await while it is available in the blocks. This is the edge case that needs to be confirmed before implementing in substrate to not introduce extra code requiring maintenance in case it is not needed by anyone.
| // Invariant: parse_and_dedup caps entries.len() at MAX_CIDS_PER_REQUEST | ||
| // and events_tx is bounded(MAX_CIDS_PER_REQUEST). At most `total` items | ||
| // are ever pushed for a batch, so the buffer cannot saturate. |
There was a problem hiding this comment.
May be worth adding a note / TODO item here that not backpressuring network responses here leads to allocation of (64 * 2 MiB) = 128 MiB max, which is acceptable in a desktop environment, but can cause issue on mobile.
| /// Top-level errors: only the batch-input validation cases — `-32801 TooManyCids`, | ||
| /// `-32802 EmptyCids`, `-32803 DuplicateCids` — are surfaced at the top level. Wholesale | ||
| /// Have-broadcast failures (no peers connected / network send queue full) are NOT top-level | ||
| /// errors per the `bitswap_unstable_stream` spec: the subscription opens normally and the | ||
| /// failure fans out as one `streamItemError(-32812 FailRetryBackoff)` per remaining valid | ||
| /// CID, followed by `streamDone`. |
There was a problem hiding this comment.
Why it was chosen to not fail fast the entire subscription in case the have request failed, delivering individual errors instead?
| // total events ever pushed for this batch is bounded by entries.len(); the | ||
| // receiver is also held by BitswapStreamHandle so the channel cannot be Closed. | ||
| for (cid_str, err) in invalid_slots { | ||
| batch.pending_count -= 1; |
There was a problem hiding this comment.
nit: for uniformity with other places
| batch.pending_count -= 1; | |
| batch.pending_count = batch.pending_count.saturating_sub(1); |
This PR adds
bitswap_unstable_streammethod defined in paritytech/json-rpc-interface-spec#186It also improves the e2e test for
bitswap_unstable_getwhich was added in #3240